New algorithms for learning and pruning oblique decision trees

Authors

  • S. Shah
  • P. Shanti Sastry
Abstract

In this paper, we present methods for learning and pruning oblique decision trees. We propose a new function for evaluating different split rules at each node while growing the decision tree. Unlike the other evaluation functions currently used in the literature (which are all based on some notion of purity of a node), this new evaluation function is based on the concept of degree of linear separability. We adopt a correlation-based optimization technique called the Alopex algorithm for finding the split rule that optimizes our evaluation function at each node. The algorithm we present here is applicable only to two-class problems. Through empirical studies, we demonstrate that our algorithm learns good, compact decision trees. We suggest a representation scheme for oblique decision trees that makes explicit the fact that an oblique decision tree represents each class as a union of convex sets bounded by hyperplanes in the feature space. Using this representation, we present a new pruning technique. Unlike other pruning techniques, which generally replace heuristically selected subtrees of the original tree by leaves, our method can radically restructure the decision tree. Through empirical investigation, we demonstrate the effectiveness of our method.
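The claim that an oblique tree represents each class as a union of convex sets can be illustrated with a small sketch. This is not the authors' representation scheme, which the abstract does not specify. The toy tree, its hyperplane weights, and the helper names (`Node`, `predict`, `leaf_regions`, `in_region`, `in_class_union`) are all hypothetical and chosen only for illustration. Each internal node tests `w.x + b > 0`; each root-to-leaf path is then an intersection of half-spaces (a convex set), and a class is the union of the regions of its leaves.

```python
import numpy as np

class Node:
    """Internal node: oblique test w.x + b > 0. Leaf: label is not None."""
    def __init__(self, w=None, b=0.0, left=None, right=None, label=None):
        self.w = None if w is None else np.asarray(w, dtype=float)
        self.b = b
        self.left = left      # branch taken when w.x + b <= 0
        self.right = right    # branch taken when w.x + b > 0
        self.label = label

def predict(node, x):
    """Classify x by walking the tree from the root to a leaf."""
    while node.label is None:
        node = node.right if np.dot(node.w, x) + node.b > 0 else node.left
    return node.label

def leaf_regions(node, path=()):
    """List (label, constraints) per leaf; each constraint is (w, b, side),
    with side=+1 meaning w.x + b > 0 and side=-1 meaning w.x + b <= 0."""
    if node.label is not None:
        return [(node.label, list(path))]
    return (leaf_regions(node.left, path + ((node.w, node.b, -1),)) +
            leaf_regions(node.right, path + ((node.w, node.b, +1),)))

def in_region(constraints, x):
    """True if x satisfies every half-space constraint (a convex region)."""
    return all((np.dot(w, x) + b > 0) == (side > 0) for w, b, side in constraints)

def in_class_union(regions, x, cls):
    """True if x lies in the union of the convex regions labelled cls."""
    return any(label == cls and in_region(c, x) for label, c in regions)

# Hypothetical two-class tree in 2-D (weights chosen arbitrarily).
tree = Node(w=[1.0, 1.0], b=-1.0,
            left=Node(label=0),
            right=Node(w=[1.0, -1.0], b=0.0,
                       left=Node(label=0),
                       right=Node(label=1)))

regions = leaf_regions(tree)
points = [np.array(p, dtype=float) for p in [(2, 0), (0, 2), (0, 0), (3, 1)]]
for x in points:
    # Tree traversal and the union-of-convex-sets view agree on every point.
    assert (predict(tree, x) == 1) == in_class_union(regions, x, 1)
```

In this view, restructuring a tree amounts to manipulating the list of convex regions rather than replacing subtrees, which is one way to read the abstract's remark that the pruning method can radically restructure the tree.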

Similar resources

Pruning Regression Trees with MDL

Pruning is a method for reducing the error and complexity of induced trees. There are several approaches to pruning decision trees, while regression trees have attracted less attention. We propose a method for pruning regression trees based on the sound foundations of the MDL principle. We develop coding schemes for various constructs and models in the leaves and empirically test the new method...

Decision Forests with Oblique Decision Trees

Ensemble learning schemes have shown impressive increases in prediction accuracy over single model schemes. We introduce a new decision forest learning scheme, whose base learners are Minimum Message Length (MML) oblique decision trees. Unlike other tree inference algorithms, MML oblique decision tree learning does not over-grow the inferred trees. The resultant trees thus tend to be shallow and ...

C5.1.3 Decision Tree Discovery

We describe the two most commonly used systems for induction of decision trees for classification: C4.5 and CART. We highlight the methods and different decisions made in each system with respect to splitting criteria, pruning, noise handling, and other differentiating features. We describe how rules can be derived from decision trees and point to some difference in the induction of regression tree...

Pruning Decision Trees and Lists

Machine learning algorithms are techniques that automatically build models describing the structure at the heart of a set of data. Ideally, such models can be used to predict properties of future data points and people can use them to analyze the domain from which the data originates. Decision trees and lists are potentially powerful predictors and embody an explicit representation of the struc...

A framework for bottom-up induction of oblique decision trees

Decision-tree induction algorithms are widely used in knowledge discovery and data mining, especially in scenarios where model comprehensibility is desired. A variation of the traditional univariate approach is the so-called oblique decision tree, which allows multivariate tests in its non-terminal nodes. Oblique decision trees can model decision boundaries that are oblique to the attribute axes...

Journal:
  • IEEE Trans. Systems, Man, and Cybernetics, Part C

Volume 29, Issue

Pages -

Publication date: 1999